Finite Sample Error Bound for Parzen Windows
Authors
Abstract
Parzen Windows, a nonparametric method, has been applied to a variety of density estimation and classification problems. Like nearest neighbor methods, Parzen Windows does not involve learning. While it converges to the true but unknown probability densities in the asymptotic limit, theoretical analysis of its performance with finite samples is lacking. In this paper we establish a finite sample error bound for Parzen Windows. We first show that Parzen Windows is an approximation to regularized least squares (RLS) methods, which have been well studied in statistical learning theory. We then derive the finite sample error bound for Parzen Windows, and discuss the properties of the error bound and its relationship to the error bound for RLS. This analysis provides insight into Parzen Windows as well as the nearest neighbor method from the point of view of learning theory. Finally, we provide empirical results on the performance of Parzen Windows and other methods such as nearest neighbors, RLS and SVMs on a number of real data sets. These results corroborate our theoretical analysis well.

Introduction

In machine learning, nonparametric methods such as Parzen Windows and nearest neighbor methods are widely used and well studied for descriptive as well as predictive modeling. To build a predictive model, class conditional probabilities are estimated from the sample data using these methods, and the decision is then made by choosing the class with the maximum class probability. Parzen Windows and nearest neighbors are in fact closely related (Duda, Hart, & Stork 2000); they differ in the choice of “window” function and thus of local region. Both methods have several attractive properties. They are easy to program: no optimization or training is required. Their performance can be very good on some problems, comparing favorably with alternative, more sophisticated methods such as neural networks.
They allow an easy application of a reject option, where a decision is deferred if one is not sufficiently confident about the predicted class.

∗This work was supported in part by Louisiana BOR Grant LEQSF(2002-05)-RD-A-29 and ARO Grant DAAD19-03-C-0111.
Copyright © 2005, American Association for Artificial Intelligence (www.aaai.org). All rights reserved.

It is well known that in the asymptotic limit, the probability estimates produced by these techniques converge to the true (unknown) probabilities. Thus the classifier based on the estimated probabilities converges to the optimal Bayes decision rule. Another well-known observation is that at least half of the classification information in an infinite data set resides in the nearest neighbor. When the sample size is finite, however, little systematic work has been carried out so far on the performance analysis of these methods. Known error bounds for these methods are typically obtained empirically (Mitchell 1997). While empirical analysis can be justified statistically, it provides only limited insight into an algorithm's performance.

Statistical learning theory (Vapnik 1998; Cucker & Smale 2001) provides a framework for analyzing error bounds for predictive models given a finite sample data set. A good predictive model is one that minimizes the empirical error on the sample data while controlling the complexity of the model. The error bound for such a model thus usually contains two parts: the empirical error and the approximation error.

In this paper, we first derive the Parzen Windows method as an approximation to RLS under appropriate conditions. We then establish a performance bound for the Parzen Windows method based on the error bound for RLS given finite samples. Thus, our contributions are: (1) demonstrating that the Parzen Windows method is an approximation to RLS; (2) estimating an error bound for the Parzen Windows classifier and discussing when the method will not perform well.
We also provide some indirect theoretical insight into the nearest neighbor technique and demonstrate the performance of these methods on a number of data sets.

Parzen Windows

Parzen Windows (Duda, Hart, & Stork 2000; Fukunaga 1990; Parzen 1962) is a technique for density estimation that can also be used for classification. Using a kernel function, it approximates a given training data distribution via a linear combination of kernels centered on the training points. Here, each class density is approximated separately, and a test point is assigned to the class with the maximal (estimated) class probability. Let $f_p(x) = \sum_{i} y_i \, k(x_i, x)$ (1)
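As a rough illustration of Equation (1), the following Python sketch evaluates the kernel-weighted sum over the training points and classifies by its sign. Several details are assumptions not stated in the excerpt: the labels are taken to be in {-1, +1}, the kernel is taken to be Gaussian with a hypothetical width parameter `sigma`, and the reject option is modeled with a hypothetical `reject_margin` threshold on the magnitude of the score. The function names are illustrative only.

```python
import numpy as np


def gaussian_kernel(X, x, sigma=1.0):
    """Return k(x_i, x) = exp(-||x_i - x||^2 / (2 sigma^2)) for each row x_i of X.

    The Gaussian form and sigma are assumptions; the excerpt does not fix a kernel.
    """
    sq_dist = np.sum((X - x) ** 2, axis=1)
    return np.exp(-sq_dist / (2.0 * sigma ** 2))


def parzen_score(X_train, y_train, x, sigma=1.0):
    """Evaluate f_p(x) = sum_i y_i k(x_i, x), assuming labels y_i in {-1, +1}."""
    return float(np.dot(y_train, gaussian_kernel(X_train, x, sigma)))


def parzen_classify(X_train, y_train, x, sigma=1.0, reject_margin=0.0):
    """Predict +1 or -1 from the sign of f_p(x).

    Returns 0 (decision deferred) when |f_p(x)| <= reject_margin, a simple
    hypothetical way to model the reject option.
    """
    score = parzen_score(X_train, y_train, x, sigma)
    if abs(score) <= reject_margin:
        return 0
    return 1 if score > 0 else -1


# Toy data: two well-separated clusters, one per class.
X_train = np.array([[0.0, 0.0], [0.0, 1.0], [3.0, 3.0], [3.0, 4.0]])
y_train = np.array([-1.0, -1.0, 1.0, 1.0])

near_negative = parzen_classify(X_train, y_train, np.array([0.2, 0.5]))
near_positive = parzen_classify(X_train, y_train, np.array([3.0, 3.5]))
# A point equidistant from both clusters scores close to zero and is rejected.
ambiguous = parzen_classify(
    X_train, y_train, np.array([1.5, 2.0]), reject_margin=0.1
)
print(near_negative, near_positive, ambiguous)
```

With labels in {-1, +1}, taking the sign of the sum amounts to comparing the unnormalized kernel sums of the two classes, which is one way to read "assign the test point to the class with the maximal estimated class probability". No training step is involved; all the work happens at prediction time.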
Similar resources
High order Parzen windows and randomized sampling
In this thesis, high order Parzen windows are studied in order to understand some algorithms in learning theory and randomized sampling in multivariate approximation. Our ideas come from the Parzen window method for density estimation and from sampling theory. First, we define basic window functions to construct our high order Parzen windows. We derived learning rates for the least-square regression and densit...
A Comparative Study of Various Probability Density Estimation Methods for Data Analysis
Probability density function (PDF) estimation is a task of primary importance in many contexts, including Bayesian learning and novelty detection. Despite the wide variety of methods available for estimating PDFs, only a few of them are widely used in practice by data analysts. Among the most used methods are histograms, Parzen windows, vector quantization based Parzen, and finite Gaussian mixtures. T...
Nonparametric Bayes-risk estimation
Abstract: Two nonparametric methods to estimate the Bayes risk using classified sample sets are described and compared. The first method uses the nearest neighbor error rate as an estimate to bound the Bayes risk. The second method estimates the Bayes decision regions by applying Parzen probability-density function estimates and counts errors made using these regions. This estimate is shown to b...
Regression based Bandwidth Selection for Segmentation using Parzen Windows
We consider the problem of segmentation of images that can be modelled as piecewise continuous signals having unknown, non-stationary statistics. We propose a solution to this problem which first uses a regression framework to estimate the image PDF, and then mean-shift to find the modes of this PDF. The segmentation follows from mode identification wherein pixel clusters or image segments are ...
Comparing performance of k-Nearest Neighbors, Parzen Windows and SVM Machine Learning Classifiers on QSAR Biodegradation Data across Multiple Dimensions
Machine learning and pattern recognition are the most popular artificial intelligence techniques for modeling systems that can learn from data. These techniques help efficiently with classification, regression, clustering, anomaly detection and similar tasks. k-Nearest Neighbors, Parzen Windows and Support Vector Machine (SVM) are some of the widely used machine learning classification techniques. This projec...